The Value of Prior Knowledge in Discovering Motifs with MEME

Authors

Timothy L. Bailey

Charles Elkan

Abstract

MEME is a tool for discovering motifs in sets of protein or DNA sequences. This paper describes several extensions to MEME that increase its ability to find motifs in a totally unsupervised fashion, but that also allow it to benefit when prior knowledge is available. When no background knowledge is asserted, MEME obtains increased robustness from a method for determining motif widths automatically, and from probabilistic models that allow motifs to be absent in some input sequences. On the other hand, MEME can exploit prior knowledge about a motif being present in all input sequences, about the length of a motif and whether it is a palindrome, and (using Dirichlet mixtures) about expected patterns in individual motif positions. Extensive experiments are reported which support the claim that MEME benefits from, but does not require, background knowledge. The experiments use seven previously studied DNA and protein sequence families and 75 of the protein families documented in the Prosite database of sites and patterns, Release 11.1.
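As a rough illustration of what prior knowledge that a DNA motif is a palindrome can mean in practice, the sketch below forces a position-specific frequency matrix to equal its own reverse complement by averaging the two. This is not the authors' implementation; the function name `palindrome_constrain`, the A, C, G, T column order, and the example numbers are all assumptions made for the illustration.

```python
# Hypothetical sketch: impose a palindrome (reverse-complement) constraint
# on a DNA motif given as a list of rows, one row per motif position.
# Assumed column order: A, C, G, T, so column 3 - j holds the complement of column j.

def palindrome_constrain(freqs):
    """Average each entry with its reverse-complement counterpart."""
    width = len(freqs)
    return [
        [(freqs[i][j] + freqs[width - 1 - i][3 - j]) / 2.0 for j in range(4)]
        for i in range(width)
    ]

# Made-up example frequencies for a motif of width 4 (each row sums to 1).
motif = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.2, 0.5, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]

constrained = palindrome_constrain(motif)
for row in constrained:
    print(["%.2f" % value for value in row])
```

After the averaging step, the frequency of A at position i equals the frequency of T at the mirrored position, and likewise for C and G, which halves the number of free parameters the model has to estimate.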

Related resources

Mixture Model based MAP Motif Discovering

In this paper a new maximum a posteriori (MAP) approach based on mixtures of multinomials is proposed for discovering probabilistic motifs in sequences. The main advantage of the proposed methodology is the ability to bypass the problem of overlapping motif occurrences among neighborhood positions in sequences through the use of a Markov Random Field (MRF) as a prior. This model consists of two...

Discovering Motifs With Transcription Factor Domain Knowledge

We introduce a new motif-discovery algorithm, DIMDom, which exploits two additional kinds of information not commonly exploited: (a) the characteristic pattern of binding site classes, where class is determined based on biological information about transcription factor domains and (b) posterior probabilities of these classes. We compared the performance of DIMDom with MEME on all the transcript...

MEME: discovering and analyzing DNA and protein sequence motifs

MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel 'signals' in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME s...

A sequential method for discovering probabilistic motifs in proteins.

OBJECTIVES: This paper proposes a greedy algorithm for learning a mixture of motifs model through likelihood maximization, in order to discover common substrings, known as motifs, from a given collection of related biosequences. METHODS: The approach sequentially adds a new motif component to a mixture model by performing a combined scheme of global and local search for appropriately initializi...

MEME-ChIP: motif analysis of large DNA datasets

MOTIVATION: Advances in high-throughput sequencing have resulted in rapid growth in large, high-quality datasets including those arising from transcription factor (TF) ChIP-seq experiments. While there are many existing tools for discovering TF binding site motifs in such datasets, most web-based tools cannot directly process such large datasets. RESULTS: The MEME-ChIP web service is designed t...

Journal:

Proceedings. International Conference on Intelligent Systems for Molecular Biology

Volume 3

Publication date: 1995
